
1,115 research outputs found

Implicit Filter Sparsification In Convolutional Neural Networks

Author: Kim Kwang In
Mehta Dushyant
Theobalt Christian
Publication date: 01/01/2019

We show that implicit filter-level sparsity manifests in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activation, and are trained with adaptive gradient descent techniques and L2 regularization or weight decay. Through an extensive empirical study (Mehta et al., 2019) we hypothesize the mechanism behind the sparsification process, and find surprising links to certain filter sparsification heuristics proposed in the literature. The emergence and subsequent pruning of selective features is observed to be one of the contributing mechanisms, leading to feature sparsity on par with or better than certain explicit sparsification / pruning approaches. In this workshop article we summarize our findings, and point out corollaries of selective-feature penalization which could also be employed as heuristics for filter pruning.
Comment: ODML-CDNNR 2019 (ICML'19 workshop) extended abstract of the CVPR 2019 paper "On Implicit Filter Level Sparsity in Convolutional Neural Networks, Mehta et al." (arXiv:1811.12495)
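As a rough illustration of what "filter-level sparsity" can mean in a network with Batch Normalization, the sketch below counts a filter as inactive when the magnitude of its Batch Normalization scale falls below a cutoff. The function name `filter_sparsity`, the cutoff value `1e-3`, and the sample scale values are assumptions made for this example; the abstract does not state the criterion the authors use.

```python
def filter_sparsity(bn_scales, threshold=1e-3):
    """Return the fraction of filters treated as inactive.

    bn_scales: per-filter Batch Normalization scale values (hypothetical input).
    threshold: magnitude below which a filter counts as inactive
               (an assumed cutoff, not taken from the abstract).
    """
    if not bn_scales:
        raise ValueError("bn_scales must not be empty")
    inactive = sum(1 for g in bn_scales if abs(g) < threshold)
    return inactive / len(bn_scales)


# Made-up scales for one layer with 8 filters; 3 fall below the cutoff.
gammas = [0.8, 0.0004, -0.5, 0.0, 1.2, -0.0009, 0.3, 0.05]
print(filter_sparsity(gammas))  # 0.375
```

A filter flagged this way contributes almost nothing after the ReLU, so it could in principle be removed without changing the output much; whether that holds for a given network would need to be checked empirically.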

arXiv.org e-Print Archive

MPG.PuRe

Example-based learning for single-image super-resolution and JPEG artifact removal

Author: Kim Kwang In
Kwon Younghee
Publication venue: Max Planck Institute for Biological Cybernetics
Publication date: 01/08/2008

This paper proposes a framework for single-image super-resolution and JPEG artifact removal. The underlying idea is to learn a map from input low-quality images (suitably preprocessed low-resolution or JPEG encoded images) to target high-quality images based on example pairs of input and output images. To keep the complexity of the resulting learning problem at a moderate level, a patch-based approach is taken such that kernel ridge regression (KRR) scans the input image with a small window (patch) and produces a patch-valued output for each output pixel location. These constitute a set of candidate images, each of which reflects different local information. An image output is then obtained as a convex combination of candidates for each pixel based on estimated confidences of candidates. To reduce the time complexity of training and testing for KRR, a sparse solution is found by combining the ideas of kernel matching pursuit and gradient descent. As a regularized solution, KRR leads to better generalization than simply storing the examples, as has been done in existing example-based super-resolution algorithms, and results in much less noisy images. However, this may introduce blurring and ringing artifacts around major edges as sharp changes are penalized severely. A prior model of a generic image class which takes into account the discontinuity property of images is adopted to resolve this problem. Comparison with existing super-resolution and JPEG artifact removal methods shows the effectiveness of the proposed method. Furthermore, the proposed method is generic in that it has the potential to be applied to many other image enhancement applications.
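The per-pixel combination step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the confidences are non-negative scores that are normalized to sum to one; the function name `combine_candidates` and the sample numbers are hypothetical, and the abstract does not say how the confidences are estimated.

```python
def combine_candidates(candidates, confidences):
    """Convex combination of candidate values for a single output pixel.

    candidates:  candidate intensities for the pixel, one per overlapping patch.
    confidences: non-negative confidence scores, one per candidate
                 (assumed given; their estimation is not covered here).
    """
    if len(candidates) != len(confidences):
        raise ValueError("need one confidence per candidate")
    if any(c < 0 for c in confidences):
        raise ValueError("confidences must be non-negative")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("at least one confidence must be positive")
    # Normalizing makes the weights non-negative and sum to 1 (a convex combination).
    weights = [c / total for c in confidences]
    return sum(w * v for w, v in zip(weights, candidates))


# Three candidates for one pixel; the third is trusted twice as much.
print(combine_candidates([100.0, 120.0, 110.0], [1.0, 1.0, 2.0]))  # 110.0
```

Because the weights are non-negative and sum to one, the result always lies between the smallest and largest candidate value, so the combination cannot overshoot the candidates it is built from.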

Lancaster E-Prints

MPG.PuRe

Active Learning Guided by Efficient Surrogate Learners

Author: An Yunpyo
Kim Kwang In
Park Suyeong
Publication date: 17/12/2023

Re-training a deep learning model each time a single data point receives a new label is impractical due to the inherent complexity of the training process. Consequently, existing active learning (AL) algorithms tend to adopt a batch-based approach where, during each AL iteration, a set of data points is collectively chosen for annotation. However, this strategy frequently leads to redundant sampling, ultimately eroding the efficacy of the labeling procedure. In this paper, we introduce a new AL algorithm that harnesses the power of a Gaussian process surrogate in conjunction with the neural network principal learner. Our proposed model adeptly updates the surrogate learner for every new data instance, enabling it to emulate and capitalize on the continuous learning dynamics of the neural network without necessitating a complete retraining of the principal model for each individual label. Experiments on four benchmark datasets demonstrate that this approach yields significant enhancements, either rivaling or aligning with the performance of state-of-the-art techniques.

arXiv.org e-Print Archive

Cyclohexane-1,2-diammonium bis(pyridine-2-carboxylate)

Author: Ha Kwang
Hwang In-Chul
Kim Nam-Ho
Publication venue: International Union of Crystallography
Publication date: 01/10/2009

In the dication of the title salt, C₆H₁₆N₂²⁺·2C₆H₄NO₂⁻, the two ammonium groups are in the equatorial positions of the chair-shaped cyclohexyl ring. In the crystal, the cations and anions are linked by N—H⋯O and N—H⋯N hydrogen bonds, forming a layer network parallel to the ac plane. Weak π–π interactions between adjacent pyridine rings with a centroid–centroid distance of 3.589 (2) Å are also present.

Crossref

Directory of Open Access Journals

PubMed Central

Bis(2,2′-bipyridine-κ²N,N′)dichloridoplatinum(IV) dichloride monohydrate

Author: Ha Kwang
Hwang In-Chul
Kim Nam-Ho
Publication venue: International Union of Crystallography
Publication date: 01/02/2009

In the title complex, [PtCl₂(C₁₀H₈N₂)₂]Cl₂·H₂O, the Pt⁴⁺ ion is six-coordinated in a distorted octahedral environment by four N atoms from the two 2,2′-bipyridine ligands and two Cl atoms. As a result of the different trans influences of the N and Cl atoms, the Pt—N bonds trans to the Cl atom are slightly longer than those trans to the N atom. The compound displays intermolecular hydrogen bonding between the water molecule and the Cl anions. There are intermolecular π–π interactions between adjacent pyridine rings, with a centroid–centroid distance of 3.962 Å.

Directory of Open Access Journals

PubMed Central

BoIR: Box-Supervised Instance Representation for Multi-Person Pose Estimation

Author: Baek Seungryul
Chang Hyung Jin
Jeong Uyoung
Kim Kwang In
Publication date: 25/09/2023

Single-stage multi-person human pose estimation (MPPE) methods have shown great performance improvements, but existing methods fail to disentangle features by individual instances under crowded scenes. In this paper, we propose a bounding box-level instance representation learning called BoIR, which simultaneously solves instance detection, instance disentanglement, and instance-keypoint association problems. Our new instance embedding loss provides a learning signal on the entire area of the image with bounding box annotations, achieving globally consistent and disentangled instance representation. Our method exploits multi-task learning of bottom-up keypoint estimation, bounding box regression, and contrastive instance embedding learning, without additional computational cost during inference. BoIR is effective for crowded scenes, outperforming state-of-the-art on COCO val (0.8 AP), COCO test-dev (0.5 AP), CrowdPose (4.9 AP), and OCHuman (3.5 AP). Code will be available at https://github.com/uyoung-jeong/BoIR
Comment: Accepted to BMVC 2023, 19 pages including the appendix, 6 figures, 7 tables

arXiv.org e-Print Archive

RGBD-Dog: Predicting Canine Pose from RGBD Sensors

Author: Cosker Darren
Kearney Sinead
Kim Kwang In
Li Wenbin
Parsons Martin
Publication venue: IEEE
Publication date: 31/12/2020

The automatic extraction of animal 3D pose from images without markers is of interest in a range of scientific fields. Most work to date predicts animal pose from RGB images, based on 2D labelling of joint positions. However, due to the difficult nature of obtaining training data, no ground truth dataset of 3D animal motion is available to quantitatively evaluate these approaches. In addition, a lack of 3D animal pose data also makes it difficult to train 3D pose-prediction methods in a similar manner to the popular field of body-pose prediction. In our work, we focus on the problem of 3D canine pose estimation from RGBD images, recording a diverse range of dog breeds with several Microsoft Kinect v2s, simultaneously obtaining the 3D ground truth skeleton via a motion capture system. We generate a dataset of synthetic RGBD images from this data. A stacked hourglass network is trained to predict 3D joint locations, which is then constrained using prior models of shape and pose. We evaluate our model on both synthetic and real RGBD images and compare our results to previously published work fitting canine models to images. Finally, despite our training set consisting only of dog data, visual inspection implies that our network can produce good predictions for images of other quadrupeds – e.g. horses or cats – when their pose is similar to that contained in our training set.

OPUS